Are Seven Words All We Need? Recognizing Genre at the Sub-Sentential Level

Authors

  • Philip M. McCarthy
  • Danielle S. McNamara
Abstract

Genre recognition is a critical facet of text comprehension. In this study, we assess the minimum number of words in a sentence necessary for genre recognition to occur. Using corpora of Narrative, History, and Science sentences, we found that three experts in discourse psychology (demonstrating high agreement) accurately recognized the genre of over 80% of the sentences. This recognition generally occurred within the first seven words, with the highest accuracy for the Narrative genre. Thus, even very short and incomplete text can potentially activate text-structure knowledge and facilitate comprehension. In addition, we show that Narrative-like sentences are the most pervasive sentence type, with expert raters assigning 51% of misclassified sentences to the Narrative genre (again with high agreement between raters). In contrast, only 11% of misclassified sentences were assigned to Science. This study allows us to establish baseline expectations for skilled readers so that we can further examine differences in speed and accuracy of genre recognition as a function of reading skill.

Similar resources

Are Three Words All We Need? Recognizing Genre at the Sub-Sentential Level

Genre identification is a critical facet of text comprehension, but very little is known about the process and information constraints of classifying texts by genre. In this study, higher-skill and lower-skill participants read 210 sentences from three genres. The words in the sentences were presented sequentially, one at a time. With each new word, participants decided whether the sentences ca...

A Psychological and Computational Study of Sub-Sentential Genre Recognition

Genre recognition is a critical facet of text comprehension and text classification. In three experiments, we assessed the minimum number of words in a sentence needed for genre recognition to occur, the distribution of genres across text, and the relationship between reading ability and genre recognition. We also propose and demonstrate a computational model for genre recognition. Usi...

First Language Activation during Second Language Lexical Processing in a Sentential Context

Lexicalization patterns, the way words are mapped onto concepts, differ from one language to another. This study investigated the influence of first language (L1) lexicalization patterns on the processing of second language (L2) words in sentential contexts by both less proficient and more proficient Persian learners of English. The focus was on cases where two different senses of a polys...

Promotion of Self in an Other-Oriented Academic Sub-Genre: The Case of Self-Mention in Acknowledgments

Although sometimes considered to act only as a means of recognizing debts, acknowledgments give writers the opportunity to display a self-conscious and reflective representation of self. Following this assumption and to reveal some of the ways this is achieved, a corpus of 80 textbook acknowledgments in the field of Linguistics and Applied Linguistics was analyzed in order to show what “se...

Discourse for Machine Translation

Statistical Machine Translation is a modern success: Given a source language sentence, SMT finds the most probable target language sentence, based on (1) properties of the source; (2) probabilistic source-target mappings at the level of words, phrases and/or sub-structures; and (3) properties of the target language. SMT translates individual sentences because the search space even for a single...

Publication date: 2007